Abstract

We present a study on lookahead hierarchies for restarting automata with auxiliary symbols
and small lookahead. In particular, we show that there are just two different classes of languages 
recognised by RRWW automata, through the restriction of lookahead size. We also show that the 
respective (left-) monotone restarting automaton models characterise the context-free languages 
and that the respective right-left-monotone restarting automata characterise the linear languages 
both with just lookahead length 2. 

1 Introduction 

Restarting automata work in phases of scanning their input from the left end marker towards 
the right end marker, rewriting the lookahead contents with a shorter substring once per phase, 
and then restarting at some point before or at the right end marker. They were introduced to
model analysis by reduction, a grammar verification technique used in the analysis of sentences in
natural languages with free word order. It has been shown that, through various restrictions on the
model, a considerable number of traditional and new formal language classes can be defined. The study
of restarting automata has therefore become important both for its original purpose, the development of
computational linguistic applications, and as an alternative machine model for investigating
properties of traditional and newly distinguished formal language classes.

In his study of lookahead hierarchies, Mraz [3] showed that the expressive power of restarting 
automata without auxiliary symbols increases with the size of the lookahead. Schluter [6] later 
showed that for deterministic monotone and monotone restarting automata with auxiliary symbols, 
separation of rewrite and restart step is not a significant restriction on expressive power for any fixed 
lookahead size k ≥ 3, and that for the deterministic model, the difference in power of the models
can be overcome by approximately doubling the lookahead size, when k ≥ 3. In both studies, it was
remarked that lookahead hierarchies collapse for (left-)mon-RWW and (left-)mon-RRWW automata
to k = 3. This paper presents a study on lookahead hierarchies for k < 3 of restarting automata with
auxiliary symbols. In doing so, we also establish lookahead hierarchies for the most general model 
of restarting automata, for any k. In particular, we show that there are only two different classes of 
languages recognised by RRWW automata, through restrictions on lookahead size. 

We also partially improve a result from [6] and [3], by showing that the respective monotone 
and left-monotone restarting automaton models characterise the context-free languages with only 
lookahead size 2. We also establish a corresponding result for the characterisation of the linear
languages by the respective right-left-monotone restarting automata with lookahead size 2. 

Following the definition of restarting automata and presentation of some useful properties in
Section 2, we present our main results in Section 3.



*This is the full version of the paper accepted at LATA 2011.

Some notation. We refer to the ith symbol of a string x as x[i], and its substring from the ith
to jth symbols as x[i,j]. When we want to make the length of a string v such that |v| = k explicit,
we may refer to v as v[1,k].

For i,j ∈ N, with i < j, [i,j] alone denotes the set {i, ..., j}. If i = 1, we say [j] := [1,j].

If S is a set of symbols, then by S^i we denote the set of strings of length i ∈ N with symbols
from S. Also, λ denotes the empty string, the only element of S^0.

Finally, REG, LIN and CFL denote the classes of regular, linear, and context-free languages
respectively.

2 Preliminaries 

A restarting automaton or RRWW-automaton, M = (Q, Σ, Γ, ¢, $, q0, k, δ), is a nondeterministic
machine model with a finite control unit and a lookahead (or read/write) window of size k (including
the symbol under its scanning head, which is the first symbol of the lookahead contents) that works
on a list of symbols delimited by end markers (or sentinels) {¢, $}, where ¢ is the left sentinel and
$ is the right sentinel. Σ is the input alphabet and Γ ⊇ Σ the work tape alphabet. The symbols
in Γ - Σ are called auxiliary symbols. Q is the finite set of states and q0 ∈ Q is the initial state.

M's transition relation, δ, describes four types of transition steps (or instructions), where u is
the contents of the lookahead.

(1) A move-right step is of the form q' ∈ δ(q,u), where q,q' ∈ Q. This means that M advances
one tape square to the right and enters state q' upon reading u.

(2) A rewrite step is of the form (q', REWRITE(v)) ∈ δ(q,u), where q,q' ∈ Q, and v is such that
|v| < |u| (u,v ∈ Γ*). This means that M replaces its window contents u with v, advances to
the tape square directly to the right of v, and enters state q'. In this rewrite instruction, we
will refer to u as the redex and v as the reduct.

(3) A restart step is of the form RESTART ∈ δ(q,u), where q ∈ Q, in which M moves its read/write
window to the beginning of the input and enters the initial state.

(4) An accept step is of the form ACCEPT ∈ δ(q,u), in which M halts and accepts. (This may also
be viewed as the accept state.)

If δ(q,u) = ∅, in which case we say that δ is undefined, M halts and rejects; we could exclude this
possibility through the use of a model with both accept and reject states, in which case all possibilities
for δ are defined. If |δ(q,u)| ≤ 1 for all q,u, then the restarting automaton is deterministic.

A configuration of M is uqv, where u ∈ {λ} ∪ {¢}·Γ* is the contents of the worktape from the
left sentinel to the position of the head, q ∈ Q is the current state and v ∈ {¢, λ}·Γ*·{$, λ} is the
contents of the worktape from the current first symbol under the scanning head to the right sentinel,
and uv is the current contents of the worktape. The head scans the first k symbols of v (or all of v
when |v| < k). A restarting configuration, for a word w ∈ Γ*, is of the form q0¢w$. If w ∈ Σ*, q0¢w$
is an initial configuration. An accepting configuration is a configuration with an accepting state.

A computation of M for an input word w ∈ Σ* is a sequence of configurations starting with an
initial configuration, where two consecutive configurations are in the relation ⊢_M induced by a finite
set of instructions of one of the above mentioned types. The transitive closure of ⊢_M is denoted ⊢*_M.
A phase of a computation begins with a restarting configuration and (exclusively) either (1) ends
with the next encountered restarting configuration, in which case it includes exactly one rewrite
step and is called a cycle, or (2) halts, in which case it includes at most one rewrite step and is
called a tail phase. We refer to segments of a computation within a single phase before (resp. after)
a rewrite as left (resp. right) computation.

An input word w is accepted or recognised by M if there is a computation which starts on the
initial configuration and finishes in an accepting configuration. Also, we define ℒ(M) as the language
recognised by M.

Consider a cycle C and say the configuration from which M carries out a rewrite step is uqv
in C; we define the right distance of C as D_r(C) := |v| and the left distance as D_l(C) := |u|.
Let C = C_1, C_2, ..., C_n be a sequence of cycles of a restarting automaton M that, together with
possibly a (final) tail phase, are M's computation on some input. If D_r(C_i) ≥ D_r(C_{i+1}) for all
i ∈ [n - 1], we say that C is right-monotone or simply monotone. Similarly, if D_l(C_i) ≥ D_l(C_{i+1})
for all i ∈ [n - 1], we say that C is left-monotone. If C is both right- and left-monotone, then we
say that C is right-left-monotone. If all the sequences of cycles corresponding to computations of
a restarting automaton M are monotone (respectively left-monotone, right-left-monotone) then we
say that M is monotone (respectively left-monotone, right-left-monotone). We denote the class of
monotone RRWW-automata (respectively left-monotone or right-left-monotone RRWW-automata) by
mon-RRWW (respectively left-mon-RRWW or right-left-mon-RRWW).
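
The distance conditions can be checked mechanically once the rewriting configuration uqv of each cycle is known. The sketch below assumes the inequalities are non-strict (distances must not increase) and represents each cycle by the pair (u, v) of tape segments at its rewrite step; the function names and the two sample computations are hypothetical.

```python
def non_increasing(values):
    """True if each value is at least as large as its successor."""
    return all(a >= b for a, b in zip(values, values[1:]))


def classify(rewrite_configs):
    """Classify a sequence of cycles given as pairs (u, v) of a configuration uqv.

    D_l(C) = |u| and D_r(C) = |v|, with the sentinels counted as tape symbols.
    """
    left = non_increasing([len(u) for u, _ in rewrite_configs])
    right = non_increasing([len(v) for _, v in rewrite_configs])
    return {
        "monotone": right,
        "left-monotone": left,
        "right-left-monotone": left and right,
    }


# Hypothetical computation on aaabbb that deletes the middle factor "ab" in each cycle.
middle_deletions = [("¢aa", "abbb$"), ("¢a", "abb$"), ("¢", "ab$")]

# Hypothetical computation on ababab: first delete the leftmost "ab", then the last one.
moving_right = [("¢", "ababab$"), ("¢ab", "ab$")]
```

The first sequence is right-left-monotone; in the second the left distance grows from 1 to 3, so it is monotone but not left-monotone.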

Through restrictions on the restarting automaton model, we obtain many types of restarting 
automata. For instance, RRW-automata are RRWW-automata with no auxiliary symbols (Γ = Σ).
An RR-automaton is an RRW-automaton with rewrite instructions that can only delete symbols.
An RWW-automaton is an RRWW-automaton that restarts immediately after any rewrite
instruction, and an RW-automaton is an RRW-automaton that restarts immediately after any rewrite
instruction. Finally, an R-automaton is an RR-automaton that restarts immediately after any rewrite instruction.

When the rewrite and restart steps are not separated, instead of items (2) and (3) in the
description of δ above, we have simply the following type of instruction.

(2/3) A rewrite step (which is combined with restarting) is of the form REWRITE(v) ∈ δ(q,u), where
q ∈ Q, and v is such that |v| < |u| (u,v ∈ Γ*). This means that M replaces its window
contents u with v and then moves its read/write window to the beginning of the input and
enters the initial state.

All notions of monotonicity and determinism and corresponding notation extend to these more 
restrictive versions in the obvious way. 

An X automaton, X ∈ {R, RR, RW, RWW, RRW, RRWW}, with lookahead size k, will be
denoted by X(k). For example, an RRWW(k) automaton is an RRWW automaton with lookahead
size k.

2.1 Restarting Automaton Specification by Regular Constraints 

Niemann and Otto [4] describe the behaviour of a non-deterministic restarting automaton M by
means of a finite set of meta-instructions of the form (E1, u → v, E2) (called cycle meta-instructions)
and (E, ACCEPT) (called tail meta-instructions). In these meta-instructions, E1, E2, and E are
regular languages, which are called the regular constraints of the meta-instruction, and u and v are
strings such that u → v stands for a rewrite step of M, where u is the redex and v is the reduct.
These meta-instructions are applied as follows. In a restarting configuration q0¢w$, M
nondeterministically chooses a meta-instruction, say (E1, u → v, E2). Now, if w does not admit a factorisation
of the form w = w1 u w2 such that ¢w1 ∈ E1 and w2$ ∈ E2, then M halts and rejects. Otherwise,
one such factorisation is chosen nondeterministically, and q0¢w$ is transformed into the restarting
configuration q0¢w1 v w2$. If (E, ACCEPT) is chosen, then M halts and accepts if ¢w$ ∈ E; otherwise,
M halts and rejects. Similarly, the behaviour of an RWW-automaton M can be described through
a finite sequence of meta-instructions of the form (E, u → v) and (E, ACCEPT).
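
The way meta-instructions are applied can be sketched as a search over restarting configurations. The code below is a minimal illustration, assuming regular constraints are given as Python regular expressions over the tape contents including the sentinels; the function `accepts` and the toy automaton (which deletes the factor ab between a block of a's and a block of b's) are hypothetical and not taken from the paper.

```python
import re

LEFT, RIGHT = "¢", "$"


def accepts(word, cycle_rules, accept_constraints):
    """Explore every restarting configuration reachable from `word`.

    cycle_rules: iterable of (E1, u, v, E2), with E1 and E2 regular expressions.
    accept_constraints: iterable of regular expressions E for (E, ACCEPT).
    The search terminates because each rewrite shortens the tape (|v| < |u|).
    """
    assert all(len(v) < len(u) for _, u, v, _ in cycle_rules)
    pending, seen = [word], set()
    while pending:
        w = pending.pop()
        if w in seen:
            continue
        seen.add(w)
        # Tail meta-instruction: accept if ¢w$ belongs to some constraint E.
        if any(re.fullmatch(e, LEFT + w + RIGHT) for e in accept_constraints):
            return True
        # Cycle meta-instruction: try every factorisation w = w1 u w2.
        for e1, u, v, e2 in cycle_rules:
            for i in range(len(w) - len(u) + 1):
                if (w.startswith(u, i)
                        and re.fullmatch(e1, LEFT + w[:i])
                        and re.fullmatch(e2, w[i + len(u):] + RIGHT)):
                    pending.append(w[:i] + v + w[i + len(u):])
    return False


# Toy automaton (illustrative): (¢a*, ab -> λ, b*$) and (¢$, ACCEPT).
TOY_RULES = [(r"¢a*", "ab", "", r"b*\$")]
TOY_ACCEPT = [r"¢\$"]
```

With these toy constraints, `accepts("aabb", TOY_RULES, TOY_ACCEPT)` follows the reductions aabb, ab, λ and accepts, whereas "aab" is reduced to "a" and rejected.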

2.2 Four Useful Properties 

This section presents four basic lemmata used in the proofs of the main results in Section 3.

The correctness preserving property is a fundamental property of restarting automata.

Proposition 1 (Correctness Preserving Property [5]). Let M be a restarting automaton, and u, v
be arbitrary input words from Σ*. If u ∈ ℒ(M) and u ⊢*_M v is an initial segment of an accepting
computation of M, then v ∈ ℒ(M).

It will be useful to simplify the computations of the restarting automata that we discuss (without 
reducing their power). The next three lemmata serve this purpose. 

A nondeterministic restarting automaton M = (Q, Σ, Γ, ¢, $, q0, k, δ) is in RR-semidet-form if
(1) halting (and restarting for automata with separate rewrite and restart steps) occurs only when
the right sentinel is under the lookahead, and (2) move-right steps are deterministic. The following
lemma shows that non-deterministic restarting automata with lookahead length k can be assumed
w.l.o.g. to be (1) in RR-semidet-form and (2) making move-right steps based only on the first symbol
under the lookahead.¹

Lemma 2. For any X-Y automaton M1 = (Q, Σ, Γ, ¢, $, q0, k, δ), where X ∈ {(right-left-, left-)mon-, λ}
and Y ∈ {R, RR, RW, RRW, RWW, RRWW}, there is an X-Y automaton M2 = (Q', Σ, Γ, ¢,
$, q0, k, δ'), such that

1. M2 is in RR-semidet form,

2. M2 makes move-right steps based on the couple (u[1], q), where u[1] is the first symbol under
the lookahead and q is M2's current state,

and ℒ(M1) = ℒ(M2).

Proof. Jancar ^ showed (1). (2) is easily seen by the specification of non-deterministic restarting 
automata by means of regular constraints. A restarting automaton specified by regular constraints 
can easily be assured to be in RR-semidet-form. Halting (and restarting for automata with separate 
rewrite and restart steps) can be made to occur after verification that the tape contents can be 
factorised according to the selected meta-instruction and once the automaton reaches the right 
sentinel. Moreover, move-right steps verify membership in a regular language, so not only can these 
move-right steps be determinised, but they can be determinised based on just the first symbol under 
the lookahead. Any monotonicity is preserved. □ 

If a restarting automaton M only rewrites when the contents of its lookahead is full, we say that 
M has fixed rewrite size. 

Lemma 3. For any X-Y automaton M1, where X ∈ {(right-left-, left-)mon-, λ} and Y ∈ {R, RR,
RW, RRW, RWW, RRWW}, there exists an X-Y automaton M2 that has fixed rewrite size, such
that ℒ(M1) = ℒ(M2).

Proof. For the proof, we construct a restarting automaton M2 from M1 that never rewrites when its
lookahead contains fewer than k symbols (where k is the length of the lookahead), supposing without
loss of generality that M1 is in RR-semidet form. We describe the case where restart and rewrite
steps are separated, the other case being easily understood from this.

M1's lookahead can only contain fewer than k symbols if it also contains the right sentinel. We
rely on a simple speed-up of M1's steps for the cases (1) where the left sentinel is also contained in
the lookahead, (2) of a right computation, or (3) of a tail phase.

Otherwise, M1 (with transition relation δ1) has a rewrite of the form (p, REWRITE(v$)) ∈ δ1(q, u$)
where |u$| < k. In this case, we "plug up" the rewrite from the left with all strings α ∈ Γ^{k-|u$|}

¹Here, the decision whether or not to move right remains non-deterministic; however, the decision of which
move-right step to carry out becomes deterministic.

such that M1 from state q' reads αu$ and enters state q with u$ the prefix of its lookahead, giving
(p, REWRITE(αv$)) ∈ δ2(q', αu$), where δ2 is M2's transition relation.

Clearly ℒ(M1) = ℒ(M2). Also, monotonicity is clearly preserved. □

Lemma 4. For any X-Y automaton M1, where X ∈ {(right-left-, left-)mon-, λ} and Y ∈ {RWW,
RRWW}, with lookahead size k, there exists an X-Y automaton M2, with lookahead size k, that
reduces its input by only one symbol per cycle, and is such that ℒ(M1) = ℒ(M2).

Proof. Let M1 = (Q, Σ, Γ, ¢, $, q0, k, δ1) be an X-RRWW automaton where X ∈ {(right-left-,
left-)mon-, λ}, with fixed rewrite size, in RR-semidet form, and that carries out move-right steps
based on only the first symbol under the lookahead. Let B be a symbol not in Γ, which we call the
blank symbol. We construct M2 = (Q ∪ Q̄ ∪ Q̂, Σ, Γ ∪ {B}, ¢, $, q0, k, δ2), such that ℒ(M1) = ℒ(M2),
from M1.

In what follows,

q, q', p, p' ∈ Q, u ∈ (Γ ∪ {¢}) · Γ^{k-2} · (Γ ∪ {$}),

x ∈ (Γ ∪ {B})^{k-2} · (Γ ∪ {B, $}), and x1 x2 ∈ (Γ ∪ {B})^{k-2}.

M2's state set includes M1's state set (Q), marked states for indicating a guess that there are
blank symbols on the tape (in left computations), Q̄ := {q̄ | q ∈ Q}, and hat states for indicating
that M2 is working in a right computation, Q̂ := {q̂ | q ∈ Q}.

In a restarting configuration, M2 can either rewrite or move right. Say M2 wants to simulate a
move-right step of M1. M2 first guesses whether there are any blank symbols currently on its tape.
If M2 guesses that there are blank symbols on its tape, then it will move into a marked state.
Otherwise it will remain in a state from Q. So, if q' ∈ δ1(q0,u), then M2 has both of the following
move-right instructions:

q̄' ∈ δ2(q0, u) for guesses that there are blank symbols on the tape, and (1)

q' ∈ δ2(q0, u) for guesses that there are no blank symbols on the tape. (2)

For rewrites, if (p, REWRITE(v)) ∈ δ1(q,u), then

(p̂, REWRITE(B^{k-1-|v|} v)) ∈ δ2(q,u). (3)

That is, we pad rewrites of M1 (from the left) with k - 1 - |v| blank symbols so that the input is
reduced by only one symbol for M2. (Note that if q = q0, since M1 has fixed rewrite size, we never
pad these lookaheads.) The state p̂ indicates that M2 has made a rewrite. There should be no blank
symbols for the rest of this cycle (right computation). Therefore if M2 finds a blank symbol while
in a hat state, it rejects:

REJECT ∈ δ2(q̂, Bx), and REJECT ∈ δ2(q̂, x1 B x2 $).

In subsequent cycles, M2 will delete the blank symbols introduced, one by one, and immediately
restart. Unless M2 is in a restarting configuration, it can only delete blank symbols if it is in a marked
state (i.e., if it guessed that there were blank symbols on the tape at the start of the cycle):

REWRITE(x) ∈ δ2(q̄, Bx), deletion of blank symbols in a marked state (4)

REWRITE(x) ∈ δ2(q0, Bx), deletion of blank symbols in the start state. (5)
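
The effect of the padding in (3) and the blank deletions in (4) and (5) on the tape can be sketched as follows. This is only an illustration of the tape contents after each cycle: states, guesses and the rejecting instructions are ignored, the leftmost blank is always the one deleted, and the function names are hypothetical. The character "B" is assumed not to occur in the tape alphabet.

```python
def padded_reduct(u, v, blank="B"):
    """Pad the reduct v from the left so that it is exactly one symbol shorter than u."""
    k = len(u)  # fixed rewrite size: the redex fills the whole window
    assert len(v) < k
    return blank * (k - 1 - len(v)) + v


def one_symbol_cycles(tape, pos, u, v, blank="B"):
    """Yield the tape after each cycle used to simulate one rewrite u -> v at `pos`."""
    assert tape[pos:pos + len(u)] == u
    tape = tape[:pos] + padded_reduct(u, v, blank) + tape[pos + len(u):]
    yield tape  # cycle with the padded rewrite
    while blank in tape:
        i = tape.index(blank)
        tape = tape[:i] + tape[i + 1:]  # one blank deleted per subsequent cycle
        yield tape
```

For example, a rewrite abcd → z inside xabcdy shortens the tape by three symbols at once, while `list(one_symbol_cycles("xabcdy", 1, "abcd", "z"))` gives the three tapes xBBzy, xBzy, xzy, each one symbol shorter than its predecessor.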

If M2 reaches the right sentinel in a marked state, and still has no blank symbols under its
lookahead, then it rejects (it has verified that its guess about the presence of blank symbols on the
tape is incorrect):

REJECT ∈ δ2(p̄, u[1, k - 1]$) for all u[1, k - 1] ∈ Γ^{k-1}.

We have already defined move-right instructions for M2 in state q0. M2 can simulate M1's
move-right steps with only the first symbol under the lookahead. Therefore we can define the rest of
M2's move-right steps simply as follows, for q' ∈ δ1(q,u) and based on just the symbol u[1] of the
lookahead (as well as the states q, q'). Here, neither q nor q' is the restart state. Also, x does not have
the right sentinel as a suffix. If M2 is in a marked state (resp. hat state, state from Q) it remains in
a marked state (resp. hat state, state from Q):

q̄' ∈ δ2(q̄, u[1]x), q̂' ∈ δ2(q̂, u[1]x), and q' ∈ δ2(q, u[1]x).

In state q̄ or q̂ and with lookahead contents u, M2 moves right and rejects (resp. accepts) if in
state q, M1 moves right and rejects (resp. accepts). Also, it is clear that ℒ(M1) = ℒ(M2). Moreover,
it is easy to see that monotonicity is preserved. □

For the remainder of this paper, we will assume w.l.o.g. that all discussed non-deterministic 
restarting automata with auxiliary symbols (1) are in RR-semi-det form, (2) carry out move-right 
steps based on the current state and the first symbol under the lookahead, (3) have fixed rewrite 
size, and (4) reduce their input by only one symbol per cycle. 



3 Main Results 

For restarting automata with auxiliary symbols and lookahead of size 1, we show that the separation
of rewrite and restart steps results in an increase in power for these automata. In fact, the result is
given for monotone restarting automata also.²

Proposition 5. For X ∈ {(right-left-, left-)mon-, λ},

REG = ℒ(X-RWW(1)) ⊂ ℒ(right-left-mon-RRWW(1)).

Proof. Mraz [3] showed that REG = ℒ(X-R(1)) = ℒ(X-RW(1)) = ℒ(X-RWW(1)), with X ∈ {det-mon,
det, mon, λ}, and this clearly also holds for X = (right-left-, left-)mon. We specify a
right-left-mon-RRWW(1) automaton M such that ℒ(M) ∈ LIN - REG, through the following regular
constraints. (Note that ℒ(right-left-mon-RRWW) = LIN [2].)

(¢(ab)*a, b → λ, (cd)*$), (¢(ab)*a, c → λ, d(cd)*$),

(¢(ab)*, a → λ, d(cd)*$), (¢(ab)*, d → λ, (cd)*$), (¢λ$, ACCEPT).

By an enumeration of the left-over context possibilities, it can be shown that
ℒ(M) = {(ab)^n (cd)^n | n ≥ 0} ∪ {(ab)^{n-1} a (cd)^n | n > 0} ∪ {(ab)^{n-1} ad (cd)^{n-1} | n > 0}
∪ {(ab)^{n-1} d (cd)^{n-1} | n > 0} ∈ LIN - REG. □
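
To make the enumeration concrete, one round of four cycles on an input (ab)^n (cd)^n with n ≥ 1 can be written out as below. The arrow ⇒ for a single cycle is notation assumed here for the illustration; the words themselves follow directly from the four cycle meta-instructions.

```latex
\begin{align*}
(ab)^n (cd)^n
  &\Rightarrow (ab)^{n-1}\, a\, (cd)^n          && \text{delete } b \\
  &\Rightarrow (ab)^{n-1}\, a\, d\, (cd)^{n-1}  && \text{delete } c \\
  &\Rightarrow (ab)^{n-1}\, d\, (cd)^{n-1}      && \text{delete } a \\
  &\Rightarrow (ab)^{n-1} (cd)^{n-1}            && \text{delete } d
\end{align*}
```

Counting the sentinels, the right distances of these four cycles are 2n+2, 2n+1, 2n+1, 2n and the left distances are 2n, 2n, 2n-1, 2n-1, so neither sequence increases within a round, nor from one round to the next.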

We can also separate the classes of languages recognised by RWW (RRWW) automata with
lookahead 1 from those recognised with lookahead 2. The result is also given for monotone restarting
automata.

Proposition 6. For all X ∈ {(right-left-, left-)mon-, λ},

ℒ(X-RWW(2)) - ℒ(X-RWW(1)) ≠ ∅ and ℒ(X-RRWW(2)) - ℒ(X-RRWW(1)) ≠ ∅.

²Note that this is only a small improvement on the fact that ℒ(X-RWW(1)) ⊂ ℒ(RRWW(1)) for all
X ∈ {(right-left-, left-)mon-, λ}, which is an immediate consequence of results in [3].

Proof. The language L = {a^n b^n | n ≥ 0} is the classic example of a linear language that is not
regular. A det-right-left-mon-RWW(2) automaton to recognise L may be specified (deterministically)
by the following regular constraints:

(¢a*, ab → c, b*$), (¢a*, cb → d, b*$), (¢a*, ad → c, b*$), (¢λ$, ACCEPT), (¢c$, ACCEPT).

On the other hand, no restarting automaton M with lookahead of size 1 can recognise this language.
A rewrite must shorten the window contents, so with lookahead 1 it can only delete a single symbol;
after the first deletion, the tape contents contain a string not in ℒ(M), which is excluded by the
correctness preserving property. □
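
The three rewriting meta-instructions of the lookahead-2 automaton can be run mechanically. The sketch below applies them deterministically and records the tape after each cycle. It takes the tape contents λ and c as accepting; this is an assumption of the sketch, based on the observation that every reduction of a^n b^n with n ≥ 1 under these three rules ends in the single symbol c. Function names are hypothetical.

```python
import re

# Cycle meta-instructions (E1, u, v, E2) with the constraints as regular expressions.
RULES = [
    (r"¢a*", "ab", "c", r"b*\$"),
    (r"¢a*", "cb", "d", r"b*\$"),
    (r"¢a*", "ad", "c", r"b*\$"),
]

# Assumption of this sketch: final tape contents regarded as accepting.
ACCEPTING = {"", "c"}


def step(w):
    """Apply the applicable rewrite to tape contents w, or return None."""
    for e1, u, v, e2 in RULES:
        # The left constraint ¢a* forces the redex to be the first occurrence of u.
        i = w.find(u)
        if (i >= 0
                and re.fullmatch(e1, "¢" + w[:i])
                and re.fullmatch(e2, w[i + len(u):] + "$")):
            return w[:i] + v + w[i + len(u):]
    return None


def trace(w):
    """Tape contents at the start of each phase, from w until no rewrite applies."""
    out = [w]
    nxt = step(w)
    while nxt is not None:
        out.append(nxt)
        nxt = step(nxt)
    return out


def recognises(w):
    return trace(w)[-1] in ACCEPTING
```

For instance, `trace("aaabbb")` yields aaabbb, aacbb, aadb, acb, ad, c, while abb is reduced to cb and then d, which is not among the accepting contents assumed here.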

It turns out that further separation of language classes for RRWW is not possible. This is the
main result of this paper, given in Theorem 7 and Corollary 12.

Theorem 7. For k ≥ 2 and X ∈ {(right-left-, left-)mon-, λ}, we have

ℒ(X-RRWW(k)) = ℒ(X-RRWW(k + 1)).

Proof of Theorem 7. Assume M1 = (Q1, Σ, Γ1, ¢, $, q0, k + 1, δ1) is an RRWW(k+1) automaton.
We construct M2 = (Q2, Σ, Γ2, ¢, $, q0, k, δ2), an RRWW(k) automaton, to simulate M1, such that
ℒ(M1) = ℒ(M2).

For this construction, the nondeterminism of M2 is essential. M2's lookahead is one symbol
shorter than M1's. So, M2 will simulate M1's rewrites by guessing the contents of the tape square,
t_R, following the last symbol of its lookahead, contained in tape square t_L. It will verify this guess
within up to one step (of the same cycle), using a compound state holding this information, leaving
behind in the compound symbol in t_L how M2 should read the guessed contents of t_R in subsequent
cycles; we'll call this instruction I. If there is a rewrite starting in t_R in a subsequent cycle, C_j,
then M2 will record in t_R that it should ignore I in all cycles after C_j. Using the Matching Lemma
(Lemma 11) concerning the "interaction" of information in t_L and t_R, M2 will be able to determine
which message is most up-to-date. Note that this simulation could not work for k = 1, because then
M2 can only delete.

We now give the formal proof of the Theorem. 

Notation for M2's Work Tape. Let Θ_{t,C} = π_{i_{-1}} π_{i_0} π_{i_1} ⋯ π_{i_{n-m+3}} denote M2's
work tape at time t in cycle C_m (m ≥ 1) of computation C on an initial input of length n, where
each π_{i_j} is a tape square boundary, for j ∈ {-1,0} ∪ [n - m + 3]. Further, with respect to Θ_{t,C},
we let T_R(π_{i_j}, t) denote the contents of the tape square to the right of π_{i_j} at time t (if it exists) and
T_L(π_{i_j}, t) the contents of the tape square to the left of π_{i_j} at time t (if it exists). So, we always have,
for example, T_R(π_{-1}, t) = ¢ = T_L(π_0, t). We call a tape square boundary internal if it is between two
tape squares. With each cycle, one tape square and boundary are destroyed and for this proof, we
say that the second tape square involved in the redex and its boundary to the left are destroyed in
the rewrite of the cycle.

Verification Information and Rewrite Instruction Set Notation. By verification information,
Verinf, we will just mean some member of the set of M1's rewrites, or the special blank symbol,
B ∉ Γ2, and we will denote the set of verification information as

Π := {(q, u[1, k + 1], v[1, k], q') | (q', REWRITE(v[1, k])) ∈ δ1(q, u[1, k + 1])} ∪ {B}.

We'll also refer to Π1 := Π - {B} as the set of M1's rewrites. For ρ = (q, u[1, k + 1], v[1, k], q') ∈ Π1,
we denote the components of ρ as follows:

redex(ρ) := u, reduct(ρ) := v, from_state(ρ) := q, and to_state(ρ) := q'.

So, for example, redex(ρ)[k + 1] = u[k + 1] and reduct(ρ)[k] = v[k]. Finally, we denote by Π2 the
set of M2's rewrites,

Π2 := {(q, x[1, k], y[1, k - 1], q') | (q', REWRITE(y[1, k - 1])) ∈ δ2(q, x[1, k])}

which will be defined shortly.



M2's Tape Alphabet. M2 has tape alphabet Γ2 := Γ1 ∪ Δ, where

Δ := {(x, Verinf, c1, c2) | x ∈ Γ1, Verinf ∈ Π, c1, c2 ∈ {0, 1, neutral}}.

The second through fourth components of the information from these compound symbols in Δ
are used for verifying rewrite guesses, updating tape contents, and determining whether updating
is necessary.

If Verinf = B, we say that Verinf is blank; we refer to the set of compound symbols with
blank verification information as Δ_B. Also, we refer to the set of compound symbols with the last
component, c2, not equal to neutral as Δ_01.

M2 uses compound symbols as either the last and possibly also the first symbol of a reduct. The
information Verinf is used for verifying rewrite guesses and updating tape contents; this component
will be non-blank in the last symbol of a reduct. Verinf represents the latest simulated rewrite
introducing a compound symbol in the tape square as the last symbol of the reduct.

The last two components of the 4-tuples in Δ take values that help determine when verification
information is out of date; the third component gives instructions about information in the following
tape square and the fourth component gives instructions about information in the preceding tape
square. Their usage will be made precise in Remark 8 and in the description of M2's rewrite and
move-right instructions.

To refer to the different components of compound symbols z = (z', Verinf, c1, c2) ∈ Δ, we
introduce the notation comp_i(z), i ∈ {2,3,4}, which refers to the ith component of z. On the other
hand, comp_1 is defined as a homomorphism comp_1 : Γ2 ∪ {¢, $} → Γ1 ∪ {¢, $} as follows, for z ∈
Γ2 ∪ {¢, $}:

comp_1(z) := z   if z ∈ Γ1 ∪ {¢, $},
             x   if z = (x, Verinf, c1, c2) ∈ Δ.

Then we extend comp_1 in the natural way to comp_1 : (Γ2 ∪ {¢, $})* → (Γ1 ∪ {¢, $})*.
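
The projection comp_1 and its extension to words can be sketched in a few lines. The encoding below (a named tuple for compound symbols, plain one-character strings for symbols of Γ1 and the sentinels) and all names are assumptions made for illustration only.

```python
from typing import NamedTuple, Sequence, Union


class Compound(NamedTuple):
    """Hypothetical encoding of a compound symbol (x, Verinf, c1, c2)."""
    x: str           # underlying symbol from the tape alphabet of M1
    verinf: object   # a rewrite of M1, or the blank marker "B"
    c1: object       # 0, 1 or "neutral"
    c2: object       # 0, 1 or "neutral"


Symbol = Union[str, Compound]


def comp1(z: Symbol) -> str:
    """Return z itself for a plain symbol, and its first component for a compound one."""
    return z.x if isinstance(z, Compound) else z


def comp1_word(word: Sequence[Symbol]) -> str:
    """Natural extension of comp1 to sequences of tape symbols."""
    return "".join(comp1(z) for z in word)
```

On a tape such as `["¢", "a", Compound("b", "B", "neutral", "neutral"), "$"]`, `comp1_word` strips the bookkeeping components and returns the plain word ¢ab$.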

Further, we inductively define a mapping h : (Γ2 ∪ {λ, ¢}) × (Γ2 ∪ {¢, $})* → (Γ1 ∪ {¢, $})* by

h(z', z) := comp_1(z)   if z' ∈ Γ1 ∪ Δ_B ∪ {¢}, or
                        if z' ∈ Δ - Δ_B, z ∈ Δ_01, and comp_4(z) = comp_3(z'), or
                        if z' = λ,
            reduct(comp_2(z'))[k]   otherwise.

Then we let h(z', zα) := h(z', z)h(z, α), where z is a single symbol.

Since compound symbols may have various components in common, we will sometimes speak of
components being introduced into tape squares. If at time t a tape square τ holds compound symbol
z with some component comp_i(z), but at time t - 1, τ's contents held some symbol z' ∈ Γ2 without
the same component (that is, either z' ∈ Γ1 or comp_i(z') ≠ comp_i(z)), then we say that comp_i(z)
was introduced (into tape square τ) at time t.

M2's State Set. For the definition of Q2, we first define the pairwise disjoint sets
Q21 and Q22 (which are also each disjoint from Q1).

Q21 := {(q, Verinf, c, d, e) | q ∈ Q1 - {ACCEPT, REJECT}, Verinf ∈ Π,
        c, e ∈ {0, 1, neutral}, d ∈ {verify, ignore, neutral}}

Q22 := {q_{u[1,k]} | q ∈ Q1, u[1, k] ∈ (Γ1 ∪ {¢})^k and δ1(q, u[1, k]$) ∈ {ACCEPT, REJECT}}

M2 has the state set Q2 := Q1 ∪ Q21 ∪ Q22, where Q22 is the set of all possible contexts leading to an
accept state for M1, used on exactly the accept step in M2's computations. The compound states
(from Q21) are only used to "pick up" information from compound symbols.

To refer to the different components of compound states q = (q', Verinf, c, d, e) ∈ Q21, we
introduce the notation COMP_i(q), i ∈ {2, 3, 4, 5}, which refers to the ith component of q. We further define
the homomorphism COMP_1 : Q2 → Q1 as follows, for q ∈ Q2:

COMP_1(q) := q   if q ∈ Q1,
             p   if q = p_{u[1,k]} ∈ Q22,
             p   if q = (p, Verinf, c, d, e) ∈ Q21.

Using the mapping h above, we define another mapping g : Q2 × (Γ2 ∪ {¢, $})* → (Γ1 ∪ {¢, $})* by

g(q, z) := comp_1(z)   if q ∈ Q1, or
                       if z ∈ Δ_01 and comp_4(z) = COMP_3(q), or
                       if z ∈ {¢, $},
           reduct(COMP_2(q))[k]   otherwise.

Then we let g(q, zα) := g(q, z)h(z, α), where z is a single symbol.

The presentation of the proof is somewhat eased by first presenting some guiding properties 
for M2 that the definition of rewrite and move-right steps will have to obey; this is the purpose 
of Remark 8 (some comments on Remark 8 follow). After this, we will prove some facts about M2
based on these properties and use these results in the remainder of our definition of M2 that follows. 

Remark 8. M2 will be defined according to the following six invariants:

(I1) M2's rewrites will be of the form (p, REWRITE(y[1, k - 1])) ∈ δ2(q, x[1, k]) where:

(a) The last symbol of the reduct, y[k-1], is from Δ - Δ_B and is such that comp_2(y[k-1]) ∈ Π1
is the rewrite of M1 simulated.

(b) The first symbol of the reduct, y[1], is from Δ_01 ∪ Γ1.

(c) All remaining symbols of the reduct, y[i], i ∈ {2, ..., k - 2}, are from Γ1.

(I2) M2 will only write a symbol from Δ_01 if in a compound state. In particular, if M2 is in
compound state q and writes symbol y ∈ Δ_01, then comp_2(y) = COMP_2(q) and comp_4(y) =
COMP_3(q).

(I3) M2 will always enter a compound state after carrying out a rewrite step. In fact, if M2 is
in compound state q after writing compound symbol y[k - 1] ∈ Δ - Δ_B, then COMP_2(q) =
comp_2(y[k - 1]), COMP_3(q) = comp_3(y[k - 1]), COMP_4(q) ∈ {verify, ignore}, and if x[k] ∈
Δ - Δ_B, then COMP_5(q) = comp_3(x[k]), otherwise COMP_5(q) = neutral.

(I4) M2 enters a compound state after reading a compound symbol from Δ - Δ_B as the first symbol
under the lookahead. Otherwise, after a move-right step M2 must be in a state from Q1. In fact,
if M2 reads symbol z ∈ Δ, then it enters a compound state q such that COMP_2(q) = comp_2(z),
COMP_3(q) = comp_3(z), and COMP_4(q) = COMP_5(q) = neutral.

(I5) M2 in compound state q with COMP_4(q) ∈ {verify, ignore} rejects if it reads a compound
symbol z ∈ Δ such that COMP_3(q) = comp_4(z).

Moreover, if M2 does not reject, then

(a) if COMP_4(q) = verify, then M2 checks that redex(COMP_2(q))[k + 1] = comp_1(z) (M2
verifies the symbol currently scanned). Furthermore, if COMP_5(q) ∈ {0, 1} and z ∈ Δ_01,
M2 also assures that COMP_5(q) = comp_4(z) (M2 verifies that the currently scanned symbol
holds the most up-to-date information).

(b) if COMP_4(q) = ignore, COMP_5(q) ∈ {0, 1}, and z ∈ Δ_01, then M2 assures that COMP_5(q) ≠
comp_4(z) (M2 verifies that the information in the currently scanned symbol, z, is
out-of-date).

Then M2 (in both cases of COMP_4(q)) enters some state p such that COMP_1(q) = COMP_1(p) and
if p ∉ Q1, then COMP_4(p) = COMP_5(p) = neutral and COMP_i(p) = comp_i(z) for i ∈ {2,3}.

(I6) Let p ∈ Q2 - ({ACCEPT, REJECT} ∪ Q22).

(a) There is some left computation on prefix ¢α, α ∈ Γ2*, in which M2 reaches state p if and only
if there is some left computation on prefix h(λ, ¢α) that puts M1 in state q = COMP_1(p).

(b) There is some right computation on prefix³ zα after which M2 enters state p, where z ∈
Γ2, α ∈ Γ2*, starting in state p', if and only if there is some right computation on prefix
h(z, α) after which M1 enters state COMP_1(p) starting in state COMP_1(p').

(11-13) concern rewrite steps, (14-15) concern move-right steps, and (16) is the main statement 
that ensm'es this proof works (vahd simulations). 

(14) ensures that $M_2$ can update tape contents after reading a compound symbol from $\Delta$, but
that it should not verify that the rewrite guess indicated in this information is correct ($\mathrm{COMP}_4(q) =
\text{neutral}$). In fact, this verification should have taken place directly following the rewrite (in the
same cycle), as is indicated in (13) ($\mathrm{COMP}_4(q) \in \{\text{verify}, \text{ignore}\}$). Points (13-15) together indicate
that $M_2$ can be in a state with fourth component in $\{\text{verify}, \text{ignore}\}$ at
most once in a cycle: verification of the rewrite guess happens during a single move-right step in
the same cycle. By the same token, $M_2$ can only be in a state with fifth component different from
neutral during that same single move-right step of the cycle: whether the last symbol under the
lookahead is up to date can be verified only in the step after a rewrite, since move-right steps
are only defined with respect to the first symbol under the lookahead.
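
The bookkeeping described in this paragraph can be illustrated with a small Python sketch. It is only an illustration under assumptions: the tuple encoding, the field names and the helper `cycle_respects_single_verification` are hypothetical and not part of the construction. Given the compound states visited in one cycle, the sketch checks that the fourth component is non-neutral at most once and that the fifth component is non-neutral only in that same state.

```python
from typing import NamedTuple, Sequence, Union

NEUTRAL = "neutral"
Bit = Union[int, str]  # 0, 1 or NEUTRAL


class CompoundState(NamedTuple):
    """Hypothetical encoding of a five-component compound state of M2."""
    base: str      # COMP_1: the simulated state of M1
    rewrite: str   # COMP_2: label of the guessed rewrite of M1
    bit: Bit       # COMP_3: matching bit
    mode: str      # COMP_4: "verify", "ignore" or NEUTRAL
    pending: Bit   # COMP_5: bit carried over from the last symbol, or NEUTRAL


def cycle_respects_single_verification(states: Sequence[CompoundState]) -> bool:
    """Return True if the states of one cycle verify a rewrite guess at most once."""
    checking = [s for s in states if s.mode != NEUTRAL]
    if len(checking) > 1:
        return False
    # A non-neutral fifth component may only accompany a non-neutral fourth one.
    return all(s.mode != NEUTRAL for s in states if s.pending != NEUTRAL)


# Example: one verifying state surrounded by neutral states is admissible.
idle = CompoundState("q", "rho", 0, NEUTRAL, NEUTRAL)
check = CompoundState("q1", "rho", 1, "verify", 0)
print(cycle_respects_single_verification([idle, check, idle]))
```

The check is deliberately local to one cycle, mirroring the observation that verification happens in the single move-right step following the rewrite.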

(12) ensures that M2 can detect when an update of the tape contents has been written onto the 
tape. (15) permits $M_2$ to keep track of cycle orders, to the extent that is necessary here. (See Lemma 11.)

From the preceding remark we easily obtain the following three facts:

Lemma 9. At no time $t$ in $M_2$'s computation $C$ is there an interior square boundary $\pi$ on $M_2$'s
work tape $\Theta_{t,C}$ such that $T_L(\pi,t) \in \Gamma_1 \cup \Delta_B \cup \{¢\}$ and $T_R(\pi,t) \in \Delta_{01}$. (No symbol from $\Gamma_1 \cup \Delta_B \cup \{¢\}$
directly precedes a symbol from $\Delta_{01}$ on $M_2$'s work tape at any time $t$ in the computation.)

Proof. This follows from (11-14). □ 

Corollary 10. $M_2$ cannot read a symbol from $\Delta_{01}$ in a state from $Q_1$.

The following Matching Lemma shows that M2 can detect the order of rewrites over consecutive 
tape squares. 

Footnote to (16b): By prefix in a right computation we mean the prefix of the segment of work tape contents following the rewrite.
Lemma 11 (Matching Lemma). At time $t$ in $M_2$'s computation $C$ let $\pi$ be an interior tape square
boundary on $M_2$'s work tape $\Theta_{t,C}$. Suppose $T_L(\pi,t) \in \Delta - \Delta_B$ and $T_R(\pi,t) \in \Delta_{01}$. Then there are
two cycles $C_{j_1}, C_{j_2} \in C$, such that

1. $M_2$ uses rewrite $\rho_i = (q_i, x_i[1,k], y_i[1,k-1], q_i')$ at time $t_i$ in $C_{j_i}$ ($i \in [2]$) such that $C_{j_1}$ introduced
$\mathrm{comp}_3(T_L(\pi,t)) = \mathrm{comp}_3(y_1[k-1])$, and $C_{j_2}$ introduced $\mathrm{comp}_4(T_R(\pi,t)) = \mathrm{comp}_4(y_2[1]) \in
\{0,1\}$.

2. (a) $\mathrm{comp}_3(T_L(\pi,t_1)) = \mathrm{comp}_4(T_R(\pi,t_2))$ implies $t_1 < t_2$.
(b) $\mathrm{comp}_3(T_L(\pi,t_1)) \neq \mathrm{comp}_4(T_R(\pi,t_2))$ implies $t_1 > t_2$.

Proof. (1) follows from (11). (2a) follows from (12) and (14). (2b) follows from (13) and (15). □

In case (2a) of the Matching Lemma, $M_2$ should update the tape square (in memory) $T_R(\pi,t)$ as
it reads it, and in case (2b), $M_2$ should ignore the instruction in $T_L(\pi,t)$ to update the information
in $T_R(\pi,t)$, since it is now "out of date". We also remark that the Matching Lemma helped provide
the definition of the mappings $h$ and $g$.
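
As a minimal sketch of this case distinction, assuming the two matching bits have already been extracted from the neighbouring tape squares, the decision can be written as a comparison of two bits. The function name `matching_action` and the string results are hypothetical and serve illustration only.

```python
def matching_action(left_bit: int, right_bit: int) -> str:
    """Decide how M2 treats the right square of a boundary.

    left_bit stands for the matching bit stored in the left square and
    right_bit for the matching bit stored in the right square.
    Equal bits correspond to case (2a), different bits to case (2b).
    """
    if left_bit not in (0, 1) or right_bit not in (0, 1):
        raise ValueError("matching bits must be 0 or 1")
    return "update" if left_bit == right_bit else "ignore"


# Example: equal bits mean the right square is updated in memory.
print(matching_action(1, 1))
```

A single bit per square is enough here because the comparison is only ever made between two consecutive tape squares.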

We now describe the rewrite and move-right instruction for M2 with k > 2. The case for k = 2 
is easily obtained from this by merging the requirements for the first and last symbols in reducts of 
the case k > 2. 

Rewrite steps of $M_2$. Let $\rho = (q, u[1,k+1], v[1,k], q') \in \delta_1$. We define a set of $M_2$'s rewrites
required for simulating $\rho$ of the form

$$\rho' = (p, x[1,k], y[1,k-1], p') \in \delta_2$$

with the following component requirements.

1. $p = q$ if $p \in Q_1$, and $p = (q, \rho'', \mathrm{comp}_3(T_L(\pi,t)), \text{neutral}, \text{neutral})$ otherwise, where $\rho''$ has
further constraints with respect to $x[1]$. (See Item (7).)

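2. For $p'$, we have (by (13))

$$p' = \begin{cases}
(q', \rho, \mathrm{comp}_3(y[k-1]), \text{verify}, \text{neutral}) & \text{if } x[k] \in \Gamma_1 \cup \Delta_B, \\
(q', \rho, \mathrm{comp}_3(y[k-1]), \text{ignore}, \mathrm{comp}_3(x[k])) & \text{if } x[k] \in \Delta - \Delta_B \text{ and only in (6b)}, \\
(q', \rho, \mathrm{comp}_3(y[k-1]), \text{verify}, \mathrm{comp}_3(x[k])) & \text{if } x[k] \in \Delta - \Delta_B \text{ and only in (6a)}.
\end{cases}$$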



3. Any $x[2,k-1] \in \Gamma_2^{k-2}$ such that $h(x[1], x[2,k-1]) = u[2,k-1]$.

4. $y[2,k-2] = v[2,k-2]$.

5. $y[k-1] = (v[k-1], \rho, c_1, \text{neutral})$, with $c_1 \in \{0,1\}$, by (11).

6. (a) any $x[k] \in \Gamma_2$ such that $h(x[k-1], x[k]) = u[k]$, or

(b) any $x[k] \in \Delta - \Delta_B$ such that $\mathrm{reduct}(\mathrm{comp}_2(x[k]))[k] = u[k+1]$, and $\mathrm{comp}_3(x[k]) \in \{0,1\}$.

7. Finally for $x[1]$, $y[1]$,

• If $p \in Q_1$, then $y[1] = v[1]$ and any $x[1] \in \Gamma_2 \cup \{¢\}$ such that $\mathrm{comp}_1(x[1]) = u[1]$ will
suffice.

• If $p \in Q_{21}$, then $y[1] = (v[1], B, \text{neutral}, \mathrm{COMP}_3(p))$ and

- any $x[1] \in (\Gamma_2 \cup \{¢\}) - \Delta_{01}$ such that $\mathrm{comp}_1(x[1]) = \mathrm{redex}(\mathrm{COMP}_2(p))[k+1]$ and
$\mathrm{reduct}(\mathrm{COMP}_2(p))[k] = u[1]$, or
- any $x[1] \in \Delta_{01}$ such that

* $\mathrm{COMP}_3(p) \neq \mathrm{comp}_4(x[1])$, $\mathrm{comp}_1(x[1]) = \mathrm{redex}(\mathrm{COMP}_2(p))[k+1]$ and $\mathrm{reduct}(\mathrm{COMP}_2(p))[k] =
u[1]$, or

* $\mathrm{COMP}_3(p) = \mathrm{comp}_4(x[1])$ and $\mathrm{comp}_1(x[1]) = u[1]$,
by the Matching Lemma.



There are no other rewrites in $\delta_2$.

Note that $M_2$ cannot rewrite over the right sentinel, since it always simulates $M_1$'s rewrites using
only the first $k$ symbols and $M_1$ has fixed rewrite size.
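
The choice of the successor state $p'$ in Item 2 can be sketched in Python as follows. This is a hedged illustration, not the construction itself: the function `successor_state`, its parameters and the tuple encoding of states are hypothetical, and the flag `via_6b` stands for "the last symbol was admitted only by condition (6b)".

```python
from typing import Optional, Tuple

NEUTRAL = "neutral"


def successor_state(q_prime: str, rho: str, y_last_bit: int,
                    x_k_compound: bool, x_k_bit: Optional[int] = None,
                    via_6b: bool = False) -> Tuple:
    """Build the five-component state entered by M2 after a rewrite.

    x_k_compound is False when the last rewritten symbol carries no rewrite
    information of its own, and True when it does (then x_k_bit is its bit).
    """
    if not x_k_compound:
        # Nothing to compare later: verify the guess, fifth component neutral.
        return (q_prime, rho, y_last_bit, "verify", NEUTRAL)
    if x_k_bit not in (0, 1):
        raise ValueError("a compound last symbol must carry a bit 0 or 1")
    mode = "ignore" if via_6b else "verify"
    return (q_prime, rho, y_last_bit, mode, x_k_bit)


# Example: a plain last symbol leads to a verifying state.
print(successor_state("q2", "rho", 0, x_k_compound=False))
```

In all three cases the first three components are the same; only the fourth and fifth components record what has to be checked in the next move-right step.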

Move-right steps of $M_2$ not derived from $M_1$'s move-right steps. We suppose without loss
of generality that $M_1$ doesn't rewrite over the right sentinel and then immediately halt. There are
two types of move-right steps for $M_2$ that are not derived from $M_1$'s move-right steps; they verify
rewrite guesses and are therefore derived from $M_1$'s rewrites. These two cases, for $\delta_2(p, x[1,k])$, are
when $p \in Q_{21}$ with $\mathrm{COMP}_4(p) \in \{\text{verify}, \text{ignore}\}$. In these move-right steps, $M_2$ simply verifies that
Invariant (15) is maintained and, if so, moves right and into state

$$(\mathrm{COMP}_1(p), \mathrm{comp}_2(x[1]), \mathrm{comp}_3(x[1]), \text{neutral}, \text{neutral}),$$

indicating that $M_2$ remains in the "same" state (with respect to $M_1$'s state), picks up $x[1]$'s verification
information (in case it must update tape contents), and its matching information (to keep track
of the order of rewrites). The fourth and fifth components are always neutral in the compound
state following any step that does not verify a rewrite step.

Move-right steps of $M_2$ derived from $M_1$'s move-right steps. Other than the move-right steps
described above, $M_2$'s move-right steps nondeterministically simulate those of $M_1$, simultaneously
updating tape contents because of rewrite guesses. Recall that since $M_1$ is in the RR-semidet-form,
we only need to consider the first symbol under the lookahead for $M_1$'s move-right
steps (so, in particular, we can talk about move-right steps in $\delta_1$ on a lookahead contents of size $k$
instead of $k+1$).



Let

$$q' \in \delta_1(q, u[1,k+1]) \qquad (6)$$

be a move-right step for $M_1$.

Firstly, $q' \in \delta_2(q, u[1]x[2,k])$, for all $h(x[1], x[2,k]) = u[2,k]$.
In addition, $M_2$ has the following instructions:

If $M_1$ accepts/rejects/restarts with fewer than $k+1$ symbols under the lookahead, then so can
$M_2$; that is, if $\delta_1(q, u[1,j]) = \text{ACCEPT}$ (resp. REJECT, RESTART) for $1 \le j \le k$, with $u[j] = \$$, then
$\delta_2(p, x[1,j]) = \text{ACCEPT}$ (resp. REJECT, RESTART) with $x[j] = \$$ and such that $\mathrm{COMP}_1(p) = q$, and for
all $x[1,j-1] \in (\Gamma_2 \cup \{¢\}) \cdot \Gamma_2^{j-2}$ such that $g(q, x[1,j-1]) = u[1,j-1]$.

In the remaining description, we describe the simulation move-right steps in which $M_1$ always
has $k+1$ symbols under the lookahead.

If $q' = \text{ACCEPT}$ (so $u[k+1] = \$$), then we have, for $q_{u[1,k]} \in Q_{22}$, $q_{u[1,k]} \in \delta_2(p, x[1,k])$, and

for all $p$ such that $\mathrm{COMP}_1(p) = q$ and $\mathrm{COMP}_4(p) = \mathrm{COMP}_5(p) = \text{neutral}$, and for all $z \in \Gamma_2$, and for
all $x[1,k] \in (\Gamma_2 \cup \{¢\}) \cdot \Gamma_2^{k-1}$ such that $g(p, x[1,k]) = u[1,k]$. Here, $M_2$ first guesses that $M_1$ would
accept and then verifies its guess. We must have $\mathrm{COMP}_4(p) = \mathrm{COMP}_5(p) = \text{neutral}$, because we have
assumed that $M_1$ does not halt immediately after rewriting.

If $q' = \text{REJECT}$, then we have simply $\text{REJECT} \in \delta_2(p, x[1,k])$ for all $p$ such that $\mathrm{COMP}_1(p) = q$ and
for all $x[1,k] \in (\Gamma_2 \cup \{¢\}) \cdot \Gamma_2^{k-1}$ such that $g(p, x[1,k]) = u[1,k]$, so long as $\mathrm{COMP}_4(p) = \mathrm{COMP}_5(p) =
\text{neutral}$. $M_2$ can guess that $M_1$ would reject; if this is not the case, there is still some computation
that does not reject.

By Corollary 10, the remaining cases for the simulation of (6) are where $M_2$ reads a compound
symbol (as the first symbol under the lookahead) and/or is in a compound state.

Suppose $p \in Q_1$, i.e., $p = q$. By Corollary 10 we must have $x[1] \in \Delta - \Delta_{01}$ and therefore
$\mathrm{comp}_1(x[1]) = u[1]$. Now $M_2$ simply picks up the information in $x[1]$ and moves right as $M_1$ would:

$$(q', \mathrm{comp}_2(x[1]), \mathrm{comp}_3(x[1]), \text{neutral}, \text{neutral}) \in \delta_2(p, x[1,k]). \qquad (7)$$

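Finally, suppose $p \in Q_{21}$; then $\mathrm{COMP}_1(p) = q$. The only case left to treat is where $\mathrm{COMP}_4(p) =
\text{neutral}$.

1. If $x[1] \in (\Gamma_2 \cup \{¢\}) - \Delta_{01}$, then $\mathrm{comp}_1(x[1]) = \mathrm{redex}(\mathrm{COMP}_2(p))[k+1]$ and $\mathrm{reduct}(\mathrm{COMP}_2(p))[k] =
u[1]$.

2. If $x[1] \in \Delta_{01}$, then by the Matching Lemma,

(a) $\mathrm{COMP}_3(p) \neq \mathrm{comp}_4(x[1])$, $\mathrm{comp}_1(x[1]) = \mathrm{redex}(\mathrm{COMP}_2(p))[k+1]$ and $\mathrm{reduct}(\mathrm{COMP}_2(p))[k] =
u[1]$, or

(b) $\mathrm{COMP}_3(p) = \mathrm{comp}_4(x[1])$ and $\mathrm{comp}_1(x[1]) = u[1]$.

$M_2$ rejects for all other contexts (except where it can rewrite).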
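
The two cases above amount to a small admissibility test on the first symbol under the lookahead. The following Python sketch is an assumption-laden illustration: `Rewrite`, `Symbol` and `first_symbol_ok` are hypothetical names, a rewrite is encoded as a pair of tuples (redex of length $k+1$, reduct of length $k$), and a symbol records a fourth component only when it carries a matching bit.

```python
from typing import NamedTuple, Optional, Tuple


class Rewrite(NamedTuple):
    redex: Tuple[str, ...]   # u[1..k+1], stored 0-indexed
    reduct: Tuple[str, ...]  # v[1..k], stored 0-indexed


class Symbol(NamedTuple):
    comp1: str                   # first component (the simulated symbol)
    comp4: Optional[int] = None  # matching bit 0/1, or None if absent


def first_symbol_ok(pending: Rewrite, state_bit: int, x1: Symbol, u1: str) -> bool:
    """Check the constraints of cases 1, 2(a) and 2(b) on x[1]."""
    k = len(pending.reduct)
    # The pending rewrite still applies: x[1] is the guessed symbol redex[k+1]
    # and the symbol M1 would see is the last symbol reduct[k].
    stale = x1.comp1 == pending.redex[k] and pending.reduct[k - 1] == u1
    if x1.comp4 is None:          # case 1: no matching bit on x[1]
        return stale
    if state_bit != x1.comp4:     # case 2(a): bits differ
        return stale
    return x1.comp1 == u1         # case 2(b): bits agree, x[1] is current


# Example with k = 2: redex "abc" was guessed to be rewritten to "xy".
rho = Rewrite(("a", "b", "c"), ("x", "y"))
print(first_symbol_ok(rho, 1, Symbol("c"), "y"))
```

Every context that fails this test corresponds to a rejecting move-right step, unless a rewrite applies.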

$M_2$'s rewrite and move-right steps being entirely determined by $M_1$'s, it follows that $\mathcal{L}(M_1) =
\mathcal{L}(M_2)$. □

As a corollary of Theorem 7 we have the following collapse of the lookahead hierarchy.

Corollary 12. For $k \ge 2$ and $X \in \{\text{(left-, right-left-)mon}, \lambda\}$, we have

$$\mathcal{L}(X\text{-RRWW}) = \bigcup_{k=2}^{\infty} \mathcal{L}(X\text{-RRWW}(k)) = \mathcal{L}(X\text{-RRWW}(2)).$$

Corollary 12 reduces the most important question concerning restarting automata, whether
the separation of rewrite and restart steps results in an increase in power, to the same question
about restarting automata with lookahead length 2: $\mathcal{L}(\text{RWW}) = \mathcal{L}(\text{RRWW}) \iff \mathcal{L}(\text{RWW}) =
\mathcal{L}(\text{RRWW}(2))$. Theorem 7 also improves on a result of [6], which was proven for $k \ge 3$, with the
following corollary (Corollary 13), as well as a corresponding corollary for right-left-monotonicity
(Corollary 14).

Corollary 13. For all $k \ge 2$ and $X \in \{\text{left-mon}, \text{mon}\}$, we have $\mathcal{L}(X\text{-RRWW}(k)) = \text{CFL}$.

Corollary 14. For all $k \ge 2$, we have $\mathcal{L}(\text{right-left-RRWW}(k)) = \text{LIN}$.

4 Concluding Remarks 

We showed that restricting the lookahead length matters less for restarting automata with
auxiliary symbols than for those without auxiliary symbols, so long as restart and rewrite steps
are separated: for RRWW automata the restriction distinguishes only two different language
classes. The respective question for RWW automata remains open.